(1) GENERAL INFORMATION:

(i) APPLICANT: HUSE, WILLIAM D.

(ii) TITLE OF INVENTION: SURFACE EXPRESSION LIBRARIES OF
HETEROMERIC RECEPTORS

(iii) NUMBER OF SEQUENCES: 75

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: PRETTY, SCHROEDER, BRUEGGEMANN & CLARK
(B) STREET: 444 SO. FLOWER STREET, SUITE 200

(C) CITY: LOS ANGELES 

(D) STATE: CALIFORNIA 

(E) COUNTRY: UNITED STATES 

(F) ZIP: 90071

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: CAMPBELL, CATHRYN A.

(B) REGISTRATION NUMBER: 31,815

(C) REFERENCE/DOCKET NUMBER: P31 8882

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 619-535-9001

(B) TELEFAX: 619-535-8949



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7445 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: circular

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AATGCTACTA CTATTAGTAG AATTGATGCC ACCTTTTCAG CTCGCGCCCG AAATGAAAAT 60

ATAGCTAAAC AGGTTATTGA CCATTTGCGA AATGTATCTA ATGGTCAAAC TAAATCTACT 120

CGTTCGCAGA ATTGGGAATC AACTCTTACA TGGAATGAAA CTTCCAGACA CCGTACTTTA 180

GTTGCATATT TAAAACATGT TGAGCTACAG CACCAGATTC AGCAATTAAG CTCTAAGCCA 240

TCTGCAAAAA TGACCTCTTA TCAAAAGGAG CAATTAAAGG TACTCTCTAA TCCTGACCTG 300

TTGGAGTTTG CTTCCGGTCT GGTTCGCTTT GAAGCTCGAA TTAAAACGCG ATATTTGAAG 360

TCTTTCGGGC TTCCTCTTAA TCTTTTTGAT GCAATCCGCT TTGCTTCTGA CTATAATAGT 420

CAGGGTAAAG ACCTGATTTT TGATTTATGG TCATTCTCGT TTTCTGAACT GTTTAAAGCA 480

TTTGAGCGGG ATTCAATCAA TATTTATGAC GATTCCGCAG TATTGGACGC TATCCAGTCT 540

AAACATTTTA CTATTACCCC CTCTGGCAAA ACTTCTTTTG CAAAAGCCTC TCGCTATTTT 600

GGTTTTTATC GTCGTCTGGT AAACGAGGGT TATGATAGTG TTGCTCTTAC TATGCCTCGT 660

AATTCCTTTT GGCGTTATGT ATCTGCATTA GTTGAATGTG GTATTCCTAA ATCTCAACTG 720

ATGAATCTTT CTACCTGTAA TAATGTTGTT CCGTTAGTTC GTTTTATTAA CGTAGATTTT 780

TCTTCCCAAC GTCCTGACTG GTATAATGAG CCAGTTCTTA AAATCGCATA AGGTAATTCA 840

CAATGATTAA AGTTGAAATT AAACCATCTC AAGCCCAATT TACTACTCGT TCTGGTGTTT 900

CTCGTCAGGG CAAGCCTTAT TCACTGAATG AGCAGCTTTG TTACGTTGAT TTGGGTAATG 960

AATATCCGGT TCTTGTCAAG ATTACTGTTG ATGAAGGTCA GCCAGCCTAT GCGCCTGGTC 1020

TGTACACCGT TCATCTGTCC TCTTTCAAAG TTGGTCAGTT CGGTTCCCTT ATGATTGACC 1080

GTCTGCGCCT CGTTCCGGCT AAGTAACATG GAGCAGGTCG CGCATTTCGA CACAATTTAT 1140

CAGGCGATGA TACAAATCTC CGTTGTACTT TGTTTCGCGC TTGGTATAAT CGCTGGGGGT 1200

CAAAGATGAG TGTTTTAGTG TATTCTTTCG CCTCTTTCGT TTTAGGTTGG TGCCTTCGTA 1260

GTGGCATTAC GTATTTTACC CGTTTAATGG AAACTTCCTC ATGAAAAAGT CTTTAGTCCT 1320

CAAAGCCTCT GTAGCCGTTG CTACCCTCGT TCCGATGCTG TCTTTCGCTG CTGAGGGTGA 1380

CGATCCCGCA AAAGCGGCCT TTAACTCCCT GCAAGCCTCA GCGACCGAAT ATATCGGTTA 1440

TGCGTGGGCG ATGGTTGTTG TCATTGTCGG CGCAACTATC GGTATCAAGC TGTTTAAGAA 1500

ATTCACCTCC AAAGCAAGCT GATAAACCGA TACAATTAAA GGCTCCTTTT GGAGCCTTTT 1560

TTTTTGGAGA TTTTCAACGT GAAAAAATTA TTATTCGCAA TTCCTTTAGT TGTTCCTTTC 1620

TATTCTCACT CCGCTGAAAC TGTTGAAAGT TGTTTAGCAA AACCCCATAC AGAAAATTCA 1680

TTTACTAACG TCTGGAAAGA CGACAAAACT TTAGATCGTT ACGCTAACTA TGAGGGTTGT 1740

CTGTGGAATG CTACAGGCGT TGTAGTTTGT ACTGGTGACG AAACTCAGTG TTACGGTACA 1800

TGGGTTCCTA TTGGGCTTGC TATCCCTGAA AATGAGGGTG GTGGCTCTGA GGGTGGCGGT 1860

TCTGAGGGTG GCGGTTCTGA GGGTGGCGGT ACTAAACCTC CTGAGTACGG TGATACACCT 1920

ATTCCGGGCT ATACTTATAT CAACCCTCTC GACGGCACTT ATCCGCCTGG TACTGAGCAA 1980

AACCCCGCTA ATCCTAATCC TTCTCTTGAG GAGTCTCAGC CTCTTAATAC TTTCATGTTT 2040

CAGAATAATA GGTTCCGAAA TAGGCAGGGG GCATTAACTG TTTATACGGG CACTGTTACT 2100

CAAGGCACTG ACCCCGTTAA AACTTATTAC CAGTACACTC CTGTATCATC AAAAGCCATG 2160

TATGACGCTT ACTGGAACGG TAAATTCAGA GACTGCGCTT TCCATTCTGG GTTTAATGAA 2220

GATCCATTCG TTTGTGAATA TCAAGCCCAA TCGTCTGACC TGCCTCAACC TCCTGTCAAT 2280

GCTGGCGGCG GCTCTGGTGG TGGTTCTGGT GGCGGCTCTG AGGGTGGTGG CTCTGAGGGT 2340

GGCGGTTCTG AGGGTGGCGG CTCTGAGGGA GGCGGTTCCG GTGGTGGCTC TGGTTCCGGT 2400

GATTTTGATT ATGAAAAGAT GGCAAACGCT AATAAGGGGG CTATGACCGA AAATGCCGAT 2460

GAAAACGCGC TACAGTCTGA CGCTAAAGGC AAACTTGATT CTCTCCCTAC TGATTACGGT 2520


GCTGCTATCG ATGGTTTCAT TGGTGACGTT TCCGGCCTTG CTAATGGTAA TGGTGCTACT 2580

GGTGATTTTG CTGGCTCTAA TTCCCAAATG GCTCAAGTCG GTGACGGTGA TAATTCACCT 2640

TTAATGAATA ATTTCCGTCA ATATTTACCT TCCCTCCCTC AATCGGTTGA ATGTCGCCCT 2700

TTTGTCTTTA GCGCTGGTAA ACCATATGAA TTTTCTATTG ATTGTGACAA AATAAACTTA 2760

TTCCGTGGTG TCTTTGCGTT TCTTTTATAT GTTGCCACCT TTATGTATGT ATTTTCTACG 2820

TTTGCTAACA TACTGCGTAA TAAGCAGTCT TAATCATGCC AGTTCTTTTG GGTATTCCGT 2880

TATTATTGCG TTTCCTCGGT TTCCTTCTGG TAACTTTGTT CGGCTATCTG CTTACTTTTC 2940

TTAAAAAGGG CTTCGGTAAG ATAGCTATTG CTATTTCATT GTTTCTTGCT CTTATTATTG 3000

GGCTTAACTC AATTCTTGTG GGTTATCTCT CTGATATTAG CGCTCAATTA CCCTCTGACT 3060

TTGTTCAGGG TGTTCAGTTA ATTCTCCCGT CTAATGCGCT TCCCTGTTTT TATGTTATTC 3120

TCTCTGTAAA GGCTGCTATT TTCATTTTTG ACGTTAAACA AAAAATCGTT TCTTATTTGG 3180

ATTGGGATAA ATAATATGGC TGTTTATTTT GTAACTGGCA AATTAGGCTC TGGAAAGACG 3240

CTCGTTAGCG TTGGTAAGAT TCAGGATAAA ATTGTAGCTG GGTGCAAAAT AGCAACTAAT 3300

CTTGATTTAA GGCTTCAAAA CCTCCCGCAA GTCGGGAGGT TCGCTAAAAC GCCTCGCGTT 3360

CTTAGAATAC CGGATAAGCC TTCTATATCT GATTTCCTTG CTATTGGGCG CGGTAATGAT 3420

TCCTACGATG AAAATAAAAA CGGCTTGCTT GTTCTCGATG AGTGCGGTAC TTGGTTTAAT 3480

ACCCGTTCTT GGAATGATAA GGAAAGACAG CCGATTATTG ATTGGTTTCT ACATGCTCGT 3540

AAATTAGGAT GGGATATTAT TTTTCTTGTT CAGGACTTAT CTATTGTTGA TAAACAGGCG 3600

CGTTCTGCAT TAGCTGAACA TGTTGTTTAT TGTCGTCGTC TGGACAGAAT TACTTTACCT 3660

TTTGTCGGTA CTTTATATTC TCTTATTACT GGCTCGAAAA TGGCTCTGCC TAAATTACAT 3720

GTTGGCGTTG TTAAATATGG CGATTCTCAA TTAAGCCCTA CTGTTGAGCG TTCGCTTTAT 3780

ACTGGTAAGA ATTTGTATAA CGCATATGAT ACTAAACAGG CTTTTTCTAG TAATTATGAT 3840

TCCGGTGTTT ATTCTTATTT AACGCCTTAT TTATCACACG GTCGGTATTT CAAACCATTA 3900

AATTTAGGTC AGAAGATGAA GCTTACTAAA ATATATTTGA AAAAGTTTTC ACGCGTTCTT 3960

TGTCTTGCGA TTGGATTTGC ATCAGCATTT ACATATAGTT ATATAACCCA ACCTAAGCCG 4020

GAGGTTAAAA AGGTAGTCTC TCAGACCTAT GATTTTGATA AATTCACTAT TGACTCTTCT 4080

CAGCGTCTTA ATCTAAGCTA TCGCTATGTT TTCAAGGATT CTAAGGGAAA ATTAATTAAT 4140

AGCGACGATT TACAGAAGCA AGGTTATTCA CTCACATATA TTGATTTATG TACTGTTTCC 4200

ATTAAAAAAG GTAATTCAAA TGAAATTGTT AAATGTAATT AATTTTGTTT TCTTGATGTT 4260

TGTTTCATCA TCTTCTTTTG CTCAGGTAAT TGAAATGAAT AATTCGCCTC TGCGCGATTT 4320

TGTAACTTGG TATTCAAAGC AATCAGGCGA ATCCGTTATT GTTTCTCCCG ATGTAAAAGG 4380

TACTGTTACT GTATATTCAT CTGACGTTAA ACCTGAAAAT CTACGCAATT TCTTTATTTC 4440

TGTTTTACGT GCTAATAATT TTGATATGGT TGGTTCAATT CCTTCCATAA TTCAGAAGTA 4500

TAATCCAAAC AATCAGGATT ATATTGATGA ATTGCCATCA TCTGATAATC AGGAATATGA 4560


TGATAATTCC GCTCCTTCTG GTGGTTTCTT TGTTCCGCAA AATGATAATG TTACTCAAAC 4620

TTTTAAAATT AATAACGTTC GGGCAAAGGA TTTAATACGA GTTGTCGAAT TGTTTGTAAA 4680

GTCTAATACT TCTAAATCCT CAAATGTATT ATCTATTGAC GGCTCTAATC TATTAGTTGT 4740

TAGTGCACCT AAAGATATTT TAGATAACCT TCCTCAATTC CTTTCTACTG TTGATTTGCC 4800

AACTGACCAG ATATTGATTG AGGGTTTGAT ATTTGAGGTT CAGCAAGGTG ATGCTTTAGA 4860

TTTTTCATTT GCTGCTGGCT CTCAGCGTGG CACTGTTGCA GGCGGTGTTA ATACTGACCG 4920

CCTCACCTCT GTTTTATCTT CTGCTGGTGG TTCGTTCGGT ATTTTTAATG GCGATGTTTT 4980

AGGGCTATCA GTTCGCGCAT TAAAGACTAA TAGCCATTCA AAAATATTCT CTGTGCCACG 5040

TATTCTTACG CTTTCAGGTC AGAAGGGTTC TATCTCTGTT GGCCAGAATG TCCCTTTTAT 5100

TACTGGTCGT GTGACTGGTG AATCTGCCAA TGTAAATAAT CCATTTCAGA CGATTGAGCG 5160

TCAAAATGTA GGTATTTCCA TGAGCGTTTT TCCTGTTGCA ATGGCTGGCG GTAATATTGT 5220

TCTGGATATT ACCAGCAAGG CCGATAGTTT GAGTTCTTCT ACTCAGGCAA GTGATGTTAT 5280

TACTAATCAA AGAAGTATTG CTACAACGGT TAATTTGCGT GATGGACAGA CTCTTTTACT 5340

CGGTGGCCTC ACTGATTATA AAAACACTTC TCAAGATTCT GGCGTACCGT TCCTGTCTAA 5400

AATCCCTTTA ATCGGCCTCC TGTTTAGCTC CCGCTCTGAT TCCAACGAGG AAAGCACGTT 5460

ATACGTGCTC GTCAAAGCAA CCATAGTACG CGCCCTGTAG CGGCGCATTA AGCGCGGCGG 5520

GTGTGGTGGT TACGCGCAGC GTGACCGCTA CACTTGCCAG CGCCCTAGCG CCCGCTCCTT 5580

TCGCTTTCTT CCCTTCCTTT CTCGCCACGT TCGCCGGCTT TCCCCGTCAA GCTCTAAATC 5640

GGGGGCTCCC TTTAGGGTTC CGATTTAGTG CTTTACGGCA CCTCGACCCC AAAAAACTTG 5700

ATTTGGGTGA TGGTTCACGT AGTGGGCCAT CGCCCTGATA GACGGTTTTT CGCCCTTTGA 5760

CGTTGGAGTC CACGTTCTTT AATAGTGGAC TCTTGTTCCA AACTGGAACA ACACTCAACC 5820

CTATCTCGGG CTATTCTTTT GATTTATAAG GGATTTTGCC GATTTCGGAA CCACCATCAA 5880

ACAGGATTTT CGCCTGCTGG GGCAAACCAG CGTGGACCGC TTGCTGCAAC TCTCTCAGGG 5940

CCAGGCGGTG AAGGGCAATC AGCTGTTGCC CGTCTCGCTG GTGAAAAGAA AAACCACCCT 6000

GGCGCCCAAT ACGCAAACCG CCTCTCCCCG CGCGTTGGCC GATTCATTAA TGCAGCTGCC 6060

ACGACAGGTT TCCCGACTGG AAAGCGGGCA GTGAGCGCAA CGCAATTAAT GTGAGTTAGC 6120

TCACTCATTA GGCACCCCAG GCTTTACACT TTATGCTTCC GGCTCGTATG TTGTGTGGAA 6180

TTGTGAGCGG ATAACAATTT CACACGCGTC ACTTGGCACT GGCCGTCGTT TTACAACGTC 6240

GTGACTGGGA AAACCCTGGC GTTACCCAAG CTTTGTACAT GGAGAAAATA AAGTGAAACA 6300

AAGCACTATT GCACTGGCAC TCTTACCGTT ACCGTTACTG TTTACCCCTG TGAOUAAGC 6360

CGCCCAGGTC CAGCTGCTCG AGTCAGGCCT ATTGTGCCCA GGGGATTGTA CTAGTGGATC 6420

CTAGGCTGAA GGCGATGACC CTGCTAAGGC TGCATTCAAT AGTTTACAGG CAAGTGCTTAC 6480

TGAGTACATT GGCTACGCTT GGGCTATGGT AGTAGTTATA GTTGGTGCTA CCATAGGGAT 6540

TAAATTATTC AAAAAGTTTA CGAGCAAGGC TTCTTAAGCA ATAGCGAAGA GGCCCGCACC 6600


GATCGCCCTT CCCAACAGTT GCGCAGCCTG AATGGCGAAT GGCCCTTTGC CTGGTTTCCG 6660

GCACCAGAAG CGGTGCCGGA AAGCTGGCTG GAGTGCGATC TTCCTGAGGC CGATACGGTC 6720

GTCGTCCCCT CAAACTGGCA GATGCACGGT TACGATGCGC CCATCTACAC CAACGTAACC 6780

TATCCCATTA CGGTCAATCC GCCGTTTGTT CCCACGGAGA ATCCGACGGG TTGTTACTCG 6840

CTCACATTTA ATGTTGATGA AAGCTGGCTA CAGGAAGGCC AGACGCGAAT TATTTTTGAT 6900

GGCGTTCCTA TTGGTTAAAA AATGAGCTGA TTTAACAAAA ATTTAACGCG AATTTTAACA 6960

AAATATTAAC GTTTACAATT TAAATATTTG CTTATACAAT CTTCCTGTTT TTGGGGCTTT 7020

TCTGATTATC AACCGGGGTA CATATGATTG ACATGCTAGT TTTACGATTA CCGTTCATCG 7080

ATTCTCTTGT TTGCTCCAGA CTCTCAGGCA ATGACCTGAT AGCCTTTGTA GATCTCTCAA 7140

AAATAGCTAC CCTCTCCGGC ATTAATTTAT CAGCTAGAAC GGTTGAATAT CATATTGATG 7200

GTGATTTGAC TGTCTCCGGC CTTTCTCACC CTTTTGAATC TTTACCTACA CATTACTCAG 7260

GCATTGCATT TAAAATATAT GAGGGTTCTA AAAATTTTTA TCCTTGCGTT GAAATAAAGG 7320

CTTCTCCCGC AAAAGTATTA CAGGGTCATA ATGTTTTTGG TACAACCGAT TTAGCTTTAT 7380

GCTCTGAGGC TTTATTGCTT AATTTTGCTA ATTCTTTGCC TTGCCTGTAT GATTTATTGG 7440

ACGTT 7445
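
Each row of a sequence description carries up to six ten-base groups followed by a running base count, and the final count should equal the LENGTH field of the sequence. The following is a minimal sketch, not part of the filed listing, of how such rows could be checked after transcription. The function name `check_listing` and the sample rows are hypothetical and were chosen only for illustration; the sketch assumes rows have already been joined so that one row sits on one line.

```python
# Hypothetical helper: checks transcribed sequence-listing rows for
# non-base characters and for running counts that do not add up.
VALID_BASES = set("ACGT")


def check_listing(rows):
    """Return (total_bases, problems) for an iterable of listing rows.

    Each row is expected to look like 'GROUP GROUP ... COUNT', where COUNT
    is the cumulative number of bases up to the end of that row.
    """
    total = 0
    problems = []
    for row_no, row in enumerate(rows, start=1):
        tokens = row.split()
        if not tokens:
            continue  # blank separator line
        if tokens[-1].isdigit():
            groups, stated = tokens[:-1], int(tokens[-1])
        else:
            groups, stated = tokens, None
            problems.append((row_no, "missing running count"))
        for group in groups:
            if set(group.upper()) - VALID_BASES:
                problems.append((row_no, f"non-base characters in {group!r}"))
        total += sum(len(group) for group in groups)
        if stated is not None and stated != total:
            problems.append((row_no, f"count {stated} != cumulative {total}"))
            total = stated  # resynchronise so later rows are judged fairly
    return total, problems


if __name__ == "__main__":
    # Made-up rows for demonstration only; they are not taken from the listing.
    sample = ["ACGTACGTAC GTACGTACGT 20", "ACGTACGTAC GTXCGTACG 40"]
    total, problems = check_listing(sample)
    print(total)
    for problem in problems:
        print(problem)
```

Running the check over a whole sequence and comparing the returned total with the stated LENGTH gives a quick indication of dropped or duplicated groups.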

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7317 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: circular

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:



AATGCTACTA CTATTAGTAG AATTGATGCC ACCTTTTCAG CTCGCGCCCC AAATGAAAAT 60

ATAGCTAAAC AGGTTATTGA CCATTTGCGA AATGTATCTA ATGGTCAAAC TAAATCTACT 120

CGTTCGCAGA ATTGGGAATC AACTGTTACA TGGAATGAAA CTTCCAGACA CCGTACTTTA 180

GTTGCATATT TAAAACATGT TGAGCTACAG CACCAGATTC AGCAATTAAG CTCTAAGCCA 240

TCCGCAAAAA TGACCTCTTA TCAAAAGGAG CAATTAAAGG TACTCTCTAA TCCTGACCTG 300

TTGGAGTTTG CTTCCGGTCT GGTTCGCTTT GAAGCTCGAA TTAAAACGCG ATATTTGAAG 360

TCTTTCGGGC TTCCTCTTAA TCTTTTTGAT GCAATCCGCT TTGCTTCTGA CTATAATAGT 420

CAGGGTAAAG ACCTGATTTT TGATTTATGG TCATTCTCGT TTTCTGAACT GTTTAAAGCA 480

TTTGAGGGGG ATTCAATGAA TATTTATGAC GATTCCGCAG TATTGGACGC TATCCAGTCT 540

AAACATTTTA CTATTACCCC CTCTGGCAAA ACTTCTTTTG CAAAAGCCTC TCGCTATTTT 600

GGTTTTTATC GTCGTCTGGT AAACGAGGGT TATGATAGTG TTGCTCTTAC TATGCCTCGT 660

AATTCCTTTT GGCGTTATGT ATCTGCATTA GTTGAATGTG GTATTCCTAA ATCTCAACTG 720

ATGAATCTTT CTACCTGTAA TAATGTTGTT CCGTTAGTTC GTTTTATTAA CGTAGATTTT 780


TCTTCCCAAC GTCCTGACTG GTATAATGAG CCAGTTCTTA AAATCGCATA AGGTAATTCA 840

CAATGATTAA AGTTGAAATT AAACCATCTC AAGCCCAATT TACTACTCGT TCTGGTGTTT 900

CTCGTCAGGG CAAGCCTTAT TCACTGAATG AGCAGCTTTG TTACGTTGAT TTGGGTAATG 960

AATATCCGGT TCTTGTCAAG ATTACTCTTG ATGAAGGTCA GCCAGCCTAT GCGCCTGGTC 1020

TGTACACCGT TCATCTGTCC TCTTTCAAAG TTGGTCAGTT CGGTTCCCTT ATGATTGACC 1080

GTCTGCGCCT CGTTCCGGCT AAGTAACATG GAGCAGGTCG CGGATTTCGA CACAATTTAT 1140

CAGGCGATGA TACAAATCTC CGTTGTACTT TGTTTCGCGC TTGGTATAAT CGCTGGGGGT 1200

CAAAGATGAG TGTTTTAGTG TATTCTTTCG CCTCTTTCGT TTTAGGTTGG TGCCTTCGTA 1260

GTGGCATTAC GTATTTTACC CGTTTAATGG AAACTTCCTC ATGAAAAAGT CTTTAGTCCT 1320

CAAAGCCTCT GTAGCCGTTG CTACCCTCGT TCCGATGCTG TCTTTCGCTG CTGAGGGTGA 1380

CGATCCCGCA AAAGCGGCCT TTAACTCCCT GCAAGCCTCA GCGACCGAAT ATATCGGTTA 1440

TGCGTGGGCG ATGGTTGTTG TCATTGTCGG CGCAACTATC GGTATCAAGC TGTTTAAGAA 1500

ATTCACCTCG AAAGCAAGCT GATAAACCCA TACAATTAAA GGCTCCTTTT GGAGCCTTTT 1560

TTTTTGGAGA TTTTCAACGT GAAAAAATTA TTATTCGCAA TTCCTTTAGT TGTTCCTTTC 1620

TATTCTCACT CCGCTGAAAC TGTTGAAAGT TGTTTAGCAA AACCCCATAC AGAAAATTCA 1680

TTTACTAACG TCTGGAAAGA CGACAAAACT TTAGATCGTT ACGCTAACTA TGAGGGTTGT 1740

CTGTGGAATG CTACAGGCGT TGTAGTTTGT ACTGGTGACG AAACTCAGTG TTACGGTACA 1800

TGGGTTCCTA TTGGGCTTGC TATCCCTGAA AATGAGGGTG GTGGCTCTGA GGGTGGCGGT 1860

TCTGAGGGTG GCGGTTCTGA GGGTGGCGGT ACTAAACCTC CTGAGTACGG TGATACACCT 1920

ATTCCGGGCT ATACTTATAT CAACCCTCTC GACGGCACTT ATCCGCCTGG TACTGAGCAA 1980

AACCCCGCTA ATCCTAATCC TTCTCTTGAG GAGTCTCAGC CTCTTAATAC TTTCATGTTT 2040

CAGAATAATA GGTTCCGAAA TAGGCAGGGG GCATTAACTG TTTATACGGG CACTGTTACT 2100

CAAGGCACTG ACCCCGTTAA AACTTATTAC CAGTACACTC CTGTATCATC AAAAGCCATG 2160

TATGACGCTT ACTGGAACGG TAAATTCAGA GACTGCGCTT TGCATTCTGG CTTTAATGAA 2220

GATCCATTCG TTTGTGAATA TCAAGGCCAA TCGTCTGACC TGCCTCAACC TCCTGTCAAT 2280

GCTGGCGGCG GCTCTGGTGG TGGTTCTGGT GGCCCCTCTG AGGCTGGTGG CTCTGAGGGT 2340

GGCGGTTCTG AGGGTGGCGG CTCTGAGGGA GGCGGTTCCG GTGGTCGCTC TGGTTCCGGT 2400

GATTTTGATT ATGAAAAGAT CGCAAACGCT AATAAGGGGG CTATGAGCGA AAATGCCGAT 2460

GAAAACGCGC TACAGTCTGA CGCTAAAGGC AAACTTGATT CTCTCGCTAC TGATTACGGT 2520

GCTGCTATCG ATGGTTTCAT TGGTGACGTT TCCGGCCTTG CTAATGGTAA TGGTGCTACT 2580

GGTGATTTTG CTGGCTCTAA TTCCCAAATG GCTCAAGTCG CTGACGGTGA TAATTCACCT 2640

TTAATGAATA ATTTCCGTCA ATATTTACCT TCCCTCCCTC AATCCGTTGA ATGTCGCCCT 2700

TTTGTCTTTA GCGCTGGTAA ACCATATGAA TTTTCTATTG ATTGTGACAA AATAAACTTA 2760

TTCCGTGGTG TCTTTGCGTT TCTTTTATAT GTTGCCACCT TTATGTATGT ATTTTCTACG 2820


TTTGCTAACA TACTGCGTAA TAAGGAGTCT TAATCATGCC AGTTCTTTTG GGTATTCCGT 2880

TATTATTGCG TTTCCTCGGT TTCCTTCTGG TAACTTTGTT CGGCTATCTG CTTACTTTTC 2940

TTAAAAAGGG CTTCGGTAAG ATAGCTATTG CTATTTCATT GTTTCTTGCT CTTATTATTG 3000

GGCTTAACTC AATTCTTGTG GGTTATCTCT CTGATATTAG CGCTCAATTA CCCTCTGACT 3060

TTGTTCAGGG TGTTCAGTTA ATTCTCCCGT CTAATGCGCT TCCCTGTTTT TATGTTATTC 3120

TCTCTGTAAA GGCTGCTATT TTCATTTTTG ACGTTAAACA AAAAATCGTT TCTTATTTGG 3180

ATTGGGATAA ATAATATGGC TGTTTATTTT GTAACTGGCA AATTAGGCTC TGGAAAGACG 3240

CTCGTTAGCG TTGGTAAGAT TCACGATAAA ATTGTAGCTG GGTGCAAAAT AGCAACTAAT 3300

CTTGATTTAA GGCTTCAAAA CCTCCCGCAA GTCGGGAGGT TCGCTAAAAC GCCTCGCGTT 3360

CTTAGAATAC CGGATAAGCC TTCTATATCT GATTTGCTTG CTATTGGGCG CGGTAATGAT 3420

TCCTACGATG AAAATAAAAA CGGCTTGCTT GTTCTCGATG AGTGCGGTAC TTGGTTTAAT 3480

ACCCGTTCTT GGAATGATAA GGAAAGACAG CCGATTATTG ATTGGTTTCT ACATGCTCGT 3540

AAATTAGGAT GGGATATTAT TTTTCTTGTT CAGGACTTAT CTATTGTTGA TAAACAGGCG 3600

CGTTCTGCAT TAGCTGAACA TGTTGTTTAT TGTCGTCGTC TGGACAGAAT TACTTTACCT 3660

TTTGTCGGTA CTTTATATTC TCTTATTACA GCCTCGAAAA TGCCTCTGCC TAAATTACAT 3720

GTTGGCGTTG TTAAATATGG CGATTCTCAA TTAAGCCCTA CTGTTGAGCG TTGGCTTTAT 3780

ACTGGTAAGA ATTTGTATAA CGCATATGAT ACTAAACAGG CTTTTTCTAG TAATTATGAT 3840

TCCGGTGTTT ATTCTTATTT AACGCCTTAT TTATCACACG GTCGGTATTT CAAACCATTA 3900

AATTTAGGTC AGAAGATGAA GCTTACTAAA ATATATTTGA AAAAGTTTTC ACGCGTTCTT 3960

TGTCTTGCGA TTGGATTTGC ATCAGCATTT ACATATAGTT ATATAACCCA ACCTAAGCCG 4020

GAGGTTAAAA AGGTAGTCTC TCAGACCTAT GATTTTGATA AATTCACTAT TGACTCTTCT 4080

CAGCGTCTTA ATCTAAGCTA TCGCTATCTT TTCAAGGATT CTAAGGGAAA ATTAATTAAT 4140

AGCGACGATT TACAGAAGCA AGGTTATTCA CTCACATATA TTGATTTATG TACTGTTTCC 4200

ATTAAAAAAG GTAATTCAAA TGAAATTGTT AAATGTAATT AATTTTGTTT TCTTGATGTT 4260

TGTTTCATCA TCTTCTTTTG CTCAGGTAAT TGAAATGAAT AATTCGCCTC TGCGCGATTT 4320

TGTAACTTGG TATTCAAAGC AATCAGGCGA ATCCGTTATT GTTTCTCCCG ATGTAAAAGG 4380

TACTGTTACT GTATATTCAT CTGACGTTAA ACCTGAAAAT CTACGCAATT TCTTTATTTC 4440

TGTTTTACGT GCTAATAATT TTGATATGGT TGGTTCAATT CCTTCCATAA TTCAGAAGTA 4500

TAATCCAAAC AATCAGGATT ATATTGATGA ATTGCCATCA TCTGATAATC AGGAATATGA 4560

TGATAATTCC GCTCCTTCTG GTGGTTTCTT TGTTCCGCAA AATGATAATG TTACTCAAAC 4620

TTTTAAAATT AATAACGTTC GGGCAAAGGA TTTAATACGA GTTGTCGAAT TGTTTGTAAA 4680

GTCTAATACT TCTAAATCCT CAAATGTATT ATCTATTGAC GGCTCTAATC TATTAGTTGT 4740

TAGTGCACCT AAAGATATTT TAGATAACCT TCCTCAATTC CTTTCTACTG TTGATTTGCC 4800

AACTGACCAG ATATTGATTG AGGGTTTGAT ATTTGAGGTT CAGCAAGGTG ATGCTTTAGA 4860


TTTTTCATTT GCTGCTGGCT CTCAGCGTGG CACTGTTGCA GGCGGTGTTA ATACTGACCG 4920

CCTCACCTCT GTTTTATCTT CTGCTGGTGG TTCGTTCGGT ATTTTTAATG GCGATGTTTT 4980

AGGGCTATCA GTTCGCGCAT TAAAGACTAA TAGCCATTCA AAAATATTGT CTGTGCCACG 5040


TATTCTTACG CTTTCAGGTC AGAAGGGTTC TATCTCTGTT GGCCAGAATG TCCCTTTTAT 5100

TACTGGTCGT GTGACTGGTG AATCTGCCAA TGTAAATAAT CCATTTCAGA CGATTGAGCG 5160

TCAAAATGTA GGTATTTCCA TGAGCGTTTT TCCTGTTGCA ATGGCTGGCG GTAATATTGT 5220

TCTGGATATT ACCAGCAAGG CCGATAGTTT GAGTTCTTCT ACTCAGGCAA GTGATGTTAT 5280

TACTAATCAA AGAAGTATTG CTACAACGGT TAATTTGCGT GATGGACAGA CTCTTTTACT 5340

CGGTGGCCTC ACTGATTATA AAAACACTTC TCAAGATTCT GGCGTACCGT TCCTGTCTAA 5400

AATCCCTTTA ATCGGCCTCC TGTTTAGCTC CCGCTCTGAT TCCAACGAGG AAAGCACGTT 5460

ATACGTGCTC GTCAAAGCAA CCATAGTACG CGCCCTGTAG CGGCGCATTA AGCGCGGCGG 5520

GTGTGGTGGT TACGCGCAGC GTGACCGCTA CACTTGCCAG CGCCCTAGCG CCCGCTCCTT 5580

TCGCTTTCTT CCCTTCCTTT CTCGCCACGT TCGCCGGCTT TCCCCGTCAA GCTCTAAATC 5640

GGGGGCTCCC TTTAGGGTTC CGATTTAGTG CTTTACGGCA CCTCGACCCC AAAAAACTTG 5700

ATTTGGGTGA TGGTTCACGT AGTGGGCCAT CGCCCTGATA GACGGTTTTT CGCCCTTTGA 5760

CGTTGGAGTC CACGTTCTTT AATAGTGGAC TCTTGTTCCA AACTGGAACA ACACTCAACC 5820

CTATCTCGGG CTATTCTTTT GATTTATAAG GGATTTTGCC GATTTCGGAA CCACCATCAA 5880

ACAGGATTTT CGCCTGCTGG GGCAAACCAG CGTGGACCGC TTGCTGCAAC TCTCTCAGGG 5940

CCAGGCGGTG AAGGGCAATC AGCTGTTGCC CGTCTCGCTG GTGAAAAGAA AAACCACCCT 6000

GGCGCCCAAT ACGCAAACCG CCTCTCCCCG CGCGTTGGCC GATTCATTAA TGCAGCTGGC 6060

ACGACAGGTT TCCCGACTGG AAAGCGGGCA GTGAGCGCAA CGCAATTAAT GTGAGTTAGC 6120

TCACTCATTA GGCACCCCAG GCTTTACACT TTATGCTTCC GGCTCGTATG TTGTGTGGAA 6180

TTGTGAGCGG ATAACAATTT CACACGCCAA vuAuAUAu X w AxAATGAAAT ACCTATTGCC 6240


TACGGCAGCC GCTGGATTGT TATTACTCGC TGCCCAACCA GCCATGGCCG AGCTCGTGAT 6300

GACCCAGACT CCAGATATCC AACAGGAATG AGTGTTAATT CTAGAACGCG TCACTTGGCA 6360

CTGGCCGTCG TTTTACAACG TCGTGACTGG GAAAACCCTG GCGTTACCCA AGCTTAATCG 6420

CCTTGCAGAA TTCCCTTTCG CGAGCTGGCG TAATAGCGAA GAGGCCCGCA CCGATCGCCC 6480

TTCCCAACAG TTGCGCAGCC TGAATGGCGA ATGGCGCTTT GCCTGGTTTC CGGCACCAGA 6540

AGCGGTGCCG GAAAGCTGGC TGGAGTGCGA TCTTCCTGAG GCCGATAGGG TCGTCGTCCC 6600

CTCAAACTGG CAGATGCACG GTTACGATGC GCCCATCTAC ACCAACGTAA CCTATCCCAT 6660

TACGGTCAAT CCGCCGTTTC TTCCCACGGA GAATCCGACG GCTTGTTACT CGCTCACATT 6720

TAATGTTGAT GAAAGCTGGC TACAGGAAGG CCAGACGCGA ATTATTTTTG ATGGCGTTCC 6780

TATTCGTTAA AAAATGAGCT GATTTAACAA AAATTTAACG CGAATTTTAA CAAAATATTA 6840

ACGTTTACAA TTTAAATATT TGCTTATACA ATCTTCCTGT TTTTGGGGCT TTTCTGATTA 6900


TCAACCGGGG TACATATGAT TGACATGCTA GTTTTACGAT TACCGTTCAT CGATTCTCTT 6960

GTTTGCTCCA GACTCTCAGG CAATGACCTG ATAGCCTTTG TAGATCTCTC AAAAATAGCT 7020

ACCCTCTCCG GCATTAATTT ATCAGCTAGA ACGGTTGAAT ATCATATTGA TGGTGATTTG 7080

ACTGTCTCCG GCCTTTCTCA CCCTTTTGAA TCTTTACCTA CACATTACTC AGGCATTGCA 7140

TTTAAAATAT ATGAGGGTTC TAAAAATTTT TATCCTTGCG TTGAAATAAA GGCTTCTCCC 7200

GCAAAAGTAT TACAGGGTCA TAATGTTTTT GGTACAACCG ATTTAGCTTT ATGGTCTGAG 7260

GCTTTATTGC TTAATTTTGC TAATTCTTTG CCTTGCCTGT ATGATTTATT GGATGTT 7317


(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7729 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: circular

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:








AATGCTACTA CTATTAGTAG AATTCATGCC ACCTTTTCAG CTCGCGCCCC AAATGAAAAT 60

ATAGCTAAAC AGGTTATTGA CCATTTGCGA AATGTATCTA ATGGTCAAAC TAAATCTACT 120

CGTTCGCAGA ATTGGGAATC AACTGTTACA TGGAATGAAA CTTCCAGACA CCGTACTTTA 180

GTTGCATATT TAAAACATGT TGAGCTACAG CACCAGATTC AGCAATTAAG CTCTAAGCCA 240

TCTGCAAAAA TGACCTCTTA TCAAAAGGAG CAATTAAAGG TACTCTCTAA TCCTGACCTG 300

TTGGAGTTTG CTTCCGGTCT GGTTCGCTTT GAAGCTCGAA TTAAAACGCG ATATTTGAAG 360

TCTTTCGGGC TTCCTCTTAA TCTTTTTGAT GCAATCCGCT TTGCTTCTGA CTATAATAGT 420

CAGGGTAAAG ACCTGATTTT TGATTTATGG TCATTCTCGT TTTCTGAACT GTTTAAAGCA 480

TTTGAGGGGG ATTCAATGAA TATTTATGAC GATTCCGCAG TATTGGACGC TATCCAGTCT 540

AAACATTTTA CTATTACCCC CTCTGGCAAA ACTTCTTTTG CAAAAGCCTC TCGCTATTTT 600

GGTTTTTATC GTCGTCTGGT AAACGAGGGT TATGATAGTG TTGCTCTTAC TATGCCTCCT 660

AATTCCTTTT GGCGTTATGT ATCTGCATTA GTTGAATGTG GTATTCCTAA ATCTCAACTG 720

ATGAATCTTT CTACCTGTAA TAATGTTGTT CCGTTAGTTC GTTTTATTAA CGTAGATTTT 780

TCTTCCCAAC GTCCTGACTG GTATAATCAG CCAGTTCTTA AAATCGCATA AGGTAATTCA 840

CAATGATTAA AGTTGAAATT AAACCATCTC AAGCCCAATT TACTACTCGT TCTGGTGTTT 900

CTCGTCAGGG CAAGCCTTAT TCACTGAATG AGCAGCTTTG TTACGTTGAT TTGGGTAATG 960

AATATCCGGT TCTTCTCAAG ATTACTCTTG ATGAAGGTCA GCCAGCCTAT GCGCCTGGTC 1020

TGTACACCGT TCATCTGTCC TCTTTCAAAG TTGGTCAGTT CGGTTCCCTT ATGATTGACC 1080

GTCTGCGCCT CGTTCCGGCT AAGTAACATG GAGCAGGTCG CGGATTTCGA CACAATTTAT 1140

CAGGCGATGA TACAAATCTC CGTTGTACTT TGTTTCGCGC TTGGTATAAT CGCTGGGGGT 1200

CAAAGATGAG TGTTTTAGTG TATTCTTTCG CCTCTTTCGT TTTAGGTTGG TGCCTTCGTA 1260


GTGGCATTAC GTATTTTACC CGTTTAATGG AAACTTCCTC ATGAAAAAGT CTTTAGTCCT 1320

CAAAGCCTCT GTAGCCGTTG CTACCCTCGT TCCGATGCTG TCTTTCGCTG CTGAGGGTGA 1380

CGATCCCGCA AAAGCGGCCT TTAACTCCCT GCAAGCCTCA GCGACCGAAT ATATCGGTTA 1440

TGCGTGGGCG ATGGTTGTTG TCATTGTCGG CGCAACTATC GGTATCAAGC TGTTTAAGAA 1500

ATTCACCTCG AAAGCAAGCT GATAAACCGA TACAATTAAA GGCTCCTTTT GGAGCCTTTT 1560

TTTTTGGAGA TTTTCAACGT GAAAAAATTA TTATTCGCAA TTCCTTTAGT TGTTCCTTTC 1620

TATTCTCACT CCGCTGAAAC TGTTGAAAGT TGTTTAGCAA AACCCCATAC AGAAAATTCA 1680

TTTACTAACG TCTGGAAAGA CGACAAAACT TTAGATCGTT ACGGTAACTA TGAGGGTTGT 1740

CTGTGGAATG CTACAGGCGT TGTAGTTTGT ACTGGTGACG AAACTCAGTG TTACGGTACA 1800

TGGGTTCCTA TTGGGCTTGC TATTCCTGAA AATGAGGGTG GTGGCTCTGA GGGTGGCGGT 1860

TCTGAGGGTG GCGGTTCTGA GGGTGGCGGT ACTAAACCTC CTGAGTACGG TGATACACCT 1920

ATTCCGGGCT ATACTTATAT CAACCCTCTC GACGGCACTT ATCCGCCTGG TACTGAGCAA 1980

AACCCCGCTA ATCCTAATCC TTCTCTTGAG GAGTCTCAGC CTCTTAATAC TTTCATGTTT 2040

CAGAATAATA GGTTCCGAAA TAGGCAGGGG GCATTAACTG TTTATACGGG CACTGTTACT 2100

CAAGGCACTG ACCCCGTTAA AACTTATTAC CAGTACACTC CTCTATCATC AAAAGCCATG 2160

TATGACGCTT ACTGGAACGG TAAATTCAGA GACTGCGCTT TCCATTCTGG CTTTAATGAA 2220

GATCCATTCG TTTGTGAATA TCAAGGCCAA TCGTCTGACC TGCCTCAACC TCCTGTCAAT 2280

GCTGGCGGCG GCTCTGGTGG TGGTTCTGGT GGCGGCTCTG AGGGTGGTGG CTCTGAGGGT 2340

GGCGGTTCTG AGGGTGGCGG CTCTGAGGGA GGCGGTTCCG GTGGTGGCTC TGGTTCCGGT 2400

GATTTTGATT ATGAAAAGAT GGCAAACGCT AATAAGGGGG CTATGACCGA AAATGCCGAT 2460

GAAAACGCGC TACAGTCTGA CGCTAAAGGC AAACTTGATT CTGTCGCTAC TGATTACGGT 2520

GCTGCTATCG ATGGTTTCAT TGGTGACGTT TCCGGCCTTG CTAATGGTAA TGGTGCTACT 2580

GGTGATTTTG CTGGCTCTAA TTCCCAAATG GCTCAAGTCG GTGACGGTGA TAATTCACCT 2640

TTAATGAATA ATTTCCGTCA ATATTTACCT TCCCTCCCTC AATCGGTTGA ATGTCGCCCT 2700

TTTGTCTTTA GCGCTGGTAA ACCATATGAA TTTTCTATTG ATTGTGACAA AATAAACTTA 2760

TTCCGTGGTG TCTTTGCGTT TCTTTTATAT GTTGCCACCT TTATGTATGT ATTTTCTACG 2820

TTTGCTAACA TACTGCGTAA TAAGGAGTCT TAATCATGCC AGTTCTTTTG GGTATTCCGT 2880

TATTATTGCG TTTCCTCGGT TTCCTTCTGG TAACTTTGTT CGGCTATCTG CTTACTTTTC 2940

TTAAAAAGGG CTTCGGTAAG ATAGCTATTG CTATTTCATT GTTTCTTGCT CTTATTATTG 3000

GGCTTAACTC AATTCTTGTG GGTTATCTCT CTGATATTAG CGCTCAATTA CCCTCTGACT 3060

TTGTTCAGGG TGTTCAGTTA ATTCTCCCGT CTAATGCGCT TCCCTGTTTT TATGTTATTC 3120

TCTCTGTAAA GGCTGCTATT TTCATTTTTG ACGTTAAACA AAAAATCGTT TCTTATTTGG 3180

ATTGGGATAA ATAATATGGC TGTTTATTTT GTAACTGGCA AATTAGGCTC TGGAAAGACG 3240

CTCGTTAGCG TTGGTAAGAT TCAGGATAAA ATTGTAGCTG GGTGCAAAAT AGCAACTAAT 3300


CTTGATTTAA GGCTTCAAAA CCTCCCGCAA GTCGGGAGGT TCGCTAAAAC GCCTCGCGTT 3360

CTTAGAATAC CGGATAAGCC TTCTATATCT GATTTGCTTG CTATTGGGCG CGGTAATGAT 3420

TCCTACGATG AAAATAAAAA CGGCTTGCTT GTTCTCGATG AGTGCGGTAC TTGGTTTAAT 3480

ACCCGTTCTT GGAATGATAA GGAAAGACAG CCGATTATTG ATTGGTTTCT ACATCCTCGT 3540

AAATTAGGAT GGGATATTAT TTTTCTTGTT CAGGACTTAT CTATTGTTGA TAAACAGGCG 3600

CGTTCTGCAT TAGCTGAACA TGTTGTTTAT TGTCGTCGTC TGGACAGAAT TACTTTACCT 3660

TTTGTCGGTA CTTTATATTA TCTTATTACT GGCTCGAAAA TGCCTCTGCC TAAATTACAT 3720

GTTGGCGTTG TTAAATATGG CGATTCTCAA TTAAGCCCTA CTGTTGAGCG TTGGCTTTAT 3780

ACTGGTAAGA ATTTCTATAA CGCATATGAT ACTAAACAGG CTTTTTCTAG TAATTATGAT 3840

TCCGGTGTTT ATTCTTATTT AACGCCTTAT TTATCACACG GTCGGTATTT CAAACCATTA 3900

AATTTAGGTC AGAAGATGAA GCTTACTAAA ATATATTTGA AAAAGTTTTC ACGCGTTCTT 3960

TGTCTTGCGA TTGGATTTGC ATCAGCATTT ACATATAGTT ATATAACCCA ACCTAAGCCG 4020

GAGGTTAAAA AGGTAGTCTC TCAGACCTAT GATTTTGATA AATTCACTAT TGACTCTTCT 4080

CAGCGTCTTA ATCTAAGCTA TCGCTATGTT TTCAAGGATT CTAAGGGAAA ATTAATTAAT 4140

AGCGACGATT TACAGAAGCA AGGTTATTCA CTCACATATA TTGATTTATG TACTGTTTCC 4200

ATTAAAAAAG GTAATTCAAA TGAAATTGTT AAATGTAATT AATTTTGTTT TCTTGATGTT 4260

TGTTTCATCA TCTTCTTTTG CTCAGGTAAT TGAAATGAAT AATTCGCCTC TGCGCGATTT 4320

TGTAACTTGG TATTCAAAGC AATCAGGCGA ATCCGTTATT GTTTCTCCCG ATGTAAAAGG 4380

TACTGTTACT GTATATTCAT CTGACGTTAA ACCTGAAAAT CTACGCAATT TCTTTATTTC 4440

TGTTTTACGT GCTAATAATT TTGATATGGT TGGTTCAATT CCTTCCATAA TTCAGAAGTA 4500

TAATCCAAAC AATCAGGATT ATATTGATGA ATTGCCATCA TCTGATAATC AGGAATATGA 4560

TGATAATTCC CCTCCTTCTC GTGGTTTCTT TGTTCCGCAA AATGATAATG TTACTCAAAC 4620

TTTTAAAATT AATAACGTTC GGGCAAAGGA TTTAATACGA GTTGTCGAAT TGTTTGTAAA 4680

GTCTAATACT TCTAAATCCT CAAATGTATT ATCTATTGAC GGCTCTAATC TATTAGTTGT 4740

TAGTGCACCT AAAGATATTT TAGATAACCT TCCTCAATTC CTTTCTACTG TTGATTTGCC 4800

AACTGACCAG ATATTGATTG AGGGTTTGAT ATTTGAGGTT CAGCAAGGTG ATGCTTTAGA 4860

TTTTTCATTT GCTGCTGGCT CTCAGCGTGG CACTGTTGGA GGCGGTGTTA ATACTGACCG 4920

CCTCACCTCT GTTTTATCTT CTGCTGCTGC TTCGTTCGGT ATTTTTAATG GCGATGTTTT 4980

AGGGCTATCA GTTCGCGCAT TAAAGACTAA TAGCCATTCA AAAATATTGT CTGTGCCACG 5040

TATTCTTACG CTTTCAGGTC AGAAGCGTTC TATCTCTGTT GGCCAGAATG TCCCTTTTAT 5100

TACTGGTCGT GTGACTGGTG AATCTGCCAA TGTAAATAAT CCATTTCAGA CCATTGAGCG 5160

TCAAAATGTA GGTATTTCCA TGAGCCTTTT TCCTGTTGCA ATGGCTGGCG GTAATATTGT 5220

TCTGGATATT ACCAGCAAGG CCGATAGTTT GAGTTCTTCT ACTCAGGCAA GTGATGTTAT 5280

TACTAATCAA AGAAGTATTG CTACAACGGT TAATTTGCGT GATGGACAGA CTCTTTTACT 5340

CGGTGGCCTC ACTGATTATA AAAACACTTC TCAAGATTCT GGCGTACCGT TCCTGTCTAA 5400

AATCCCTTTA ATCGGCCTCC TGTTTAGCTC CCGCTCTGAT TCCAACGAGG AAAGCACGTT 5460 

ATACGTGCTC GTCAAAGCAA CCATAGTACG CGCCCTGTAG CCGCGCATTA AGCGCGGCGG 5520 

GTGTGGTGGT TACGCGCAGC GTGACCGCTA CACTTGCCAG CGCCCTAGCG CCCGCTCCTT 5580

TCGCTTTCTT CCCTTCCTTT GTCGCCACGT TCGCCGGCTT TCCCCGTCAA GCTCTAAATC 5640 

GGGGGCTCCC TTTAGGGTTC CGATTTAGTG CTTTACGGCA CCTCGACCCC AAAAAACTTG 5700

ATTTGGGTGA TGGTTCACGT AGTCGGCCAT CGCCCTGATA GACGGTTTTT CGCCCTTTGA 5760 

CGTTGGAGTC CACGTTCTTT AATAGTGGAC TCTTGTTCCA AACTGGAACA ACACTCAACC 5820 

CTATCTCGGG CTATTCTTTT GATTTATAAG GGATTTTGCC GATTTCGGAA CCACCATCAA 5880 

ACAGGATTTT CGCCTGCTGG GGCAAACCAG CGTGGACCGC TTGCTGCAAC TCTCTCAGGG 5940 

CCAGGCGGTG AAGGGCAATC AGCTGTTCCC CGTCTCGCTG GTGAAAAGAA AAACCACCCT 6000 

GGCGCCCAAT ACGCAAACCG CCTCTCCCCG CGCGTTGGCC GATTCATTAA TGCAGCTGGC 6060

ACGACAGGTT TCCCGACTGG AAAGCGGGCA GTGAGCGCAA CGCAATTAAT GTGAGTTAGC 6120

TCACTCATTA GGCACCCCAG GCTTTACACT TTATGCTTCC GGCTCGTATG TTGTGTGGAA 6180

TTGTGAGCGG ATAACAATTT CACACGCGTC ACTTGGCACT GGCCGTCGTT TTACAACGTC 6240

GTGACTGGGA AAACCCTGGC GTTACCCAAG CTTTGTACAT GGAGAAAATA AAGTGAAACA 6300

AAGCACTATT GCACTGGCAC TCTTACCGTT ACTGTTTACC CCTGTGGCAA AAGCCCAGGT 6360 

CCAGCTGCTC GAGTCGGTCT TCCCCCTGGC ACCCTCCTCC AAGAGCACCT CTGGGGGCAC 6420 

AGCGGCCCTG GGCTGCCTGG TCAAGACTAA TTCCCCGAAC CGGTGACGGT GTCGTGGAAC 6480 

TCAGGCGCCC TGACCAGCGG CGTGCACACC TTCCCGGCTG TCCTACAGTC CTCAGGACTC 6540

TACTCCCTCA CCAGCGTGGT GACCGTGCCC TCCAGCAGCT TGGGCACCCA GACCTACATC 6600

TCCAACCTGA ATCACAAGCC CAGCAACACC AAGCTCGACA AGAAAGCAGA GCCCAAATCT 6660

TGTACTAGTG GATCCTACCC GTACGACGTT CCGGACTACG CTTCTTAGCC TGAAGGCGAT 6720

GACCCTGCTA AGGCTGCATT CAATAGTTTA CAGGCAAGTG CTACTGAGTA CATTGGCTAC 6780

GCTTGGGCTA TGGTAGTAGT TATAGTTGCT GCTACCATAG GGATTAAATT ATTCAAAAAG 6840

TTTACGAGCA AGGCTTCTTA AGCAATAGCG AAGAGGCCCG CACCGATCGC CCTTCCCAAC 6900 

AGTTGCGCAG CCTGAATGGC GAATGGCGCT TTGCCTGGTT TCCGGCACCA GAAGCGGTGC 6960 

CCGAAAGCTG GCTGGAGTGC GATCTTCCTG AGGCCGATAC GGTCGTCGTC CCCTCAAACT 7020 

GGCAGATGCA CGGTTACGAT GCGCCCATCT ACACCAACGT AACCTATCCC ATTACGGTCA 7080

ATCCGCCGTT TGTTCCCACG GAGAATCCGA CGGGTTGTTA CTCGCTCACA TTTAATGTTG 7140

ATGAAAGCTG GCTACAGGAA GGCCAGACGC GAATTATTTT TGATGCCGTT CCTATTGGTT 7200

AAAAAATGAG CTGATTTAAC AAAAATTTAA CGCGAATTTT AACAAAATAT TAACGTTTAC 7260 

AATTTAAATA TTTGCTTATA CAATCTTCCT CTTTTTGGGG CTTTTCTGAT TATCAACCGG 7320

GGTACATATG ATTGACATGC TAGTTTTACG ATTACCGTTC ATCGATTCTC TTGTTTGCTC 7380

CAGACTCTCA GGCAATGACC TGATAGCCTT TGTAGATCTC TCAAAAATAG CTACCCTCTC 7440

CGGCATTAAT TTATCAGCTA GAACGGTTGA ATATCATATT GATGGTGATT TGACTGTCTC 7500

CGGCCTTTCT CACCCTTTTG AATCTTTACC TACACATTAC TCAGGCATTG CATTTAAAAT 7560

ATATGAGGGT TCTAAAAATT TTTATCCTTG CGTTGAAATA AAGGCTTCTC CCGCAAAAGT 7620

ATTACAGGGT CATAATGTTT TTGGTACAAC CGATTTAGCT TTATGCTCTG AGGCTTTATT 7680

GCTTAATTTT GCTAATTCTT TGCCTTGCCT GTATGATTTA TTGGACGTT 7729
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7557 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: circular

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

AATGCTACTA CTATTAGTAG AATTGATGCC ACCTTTTCAG CTCGCGCCCC AAATGAAAAT 60

ATAGCTAAAC AGGTTATTGA CCATTTGCGA AATGTATCTA ATGGTCAAAC TAAATCTACT 120

CGTTCGCAGA ATTGGGAATC AACTGTTACA TGGAATGAAA CTTCCAGACA CCGTACTTTA 180

GTTGCATATT TAAAACATGT TGAGCTACAG CACCAGATTC AGCAATTAAG CTCTAAGCCA 240 

TCCGCAAAAA TGACCTCTTA TCAAAAGGAG CAATTAAAGG TACTCTCTAA TCCTGACCTG 300

TTGGAGTTTG CTTCCGGTCT GGTTCGCTTT GAAGCTCGAA TTAAAACGCG ATATTTGAAG 360

TCTTTCGGGC TTCCTCTTAA TCTTTTTGAT GCAATCCGCT TTGCTTCTGA CTATAATAGT 420

CAGGGTAAAG ACCTGATTTT TGATTTATGG TCATTCTCGT TTTCTGAACT GTTTAAAGCA 480

TTTGAGGGGG ATTCAATCAA TATTTATGAC GATTCCGCAG TATTGGACGC TATCCAGTCT 540

AAACATTTTA CTATTACCCC CTCTGGCAAA ACTTCTTTTG CAAAAGCCTC TCGCTATTTT 600

GGTTTTTATC GTCGTCTGGT AAACGAGGGT TATGATAGTG TTGCTCTTAC TATGCCTCGT 660 

AATTCCTTTT GGCGTTATGT ATCTGCATTA GTTGAATGTG GTATTCCTAA ATCTCAACTG 720 

ATGAATCTTT CTACCTGTAA TAATGTTGTT CCGTTAGTTC GTTTTATTAA CGTAGATTTT 780

TCTTCCCAAC GTCCTGACTG GTATAATGAG CCAGTTCTTA AAATCGCATA AGGTAATTCA 840

CAATGATTAA AGTTGAAATT AAACCATCTC AAGCCCAATT TACTACTCGT TCTGGTGTTT 900

CTCGTCAGGG CAAGCCTTAT TCACTGAATG AGCAGCTTTC TTACGTTGAT TTGGGTAATG 960

AATATCCGGT TCTTGTCAAG ATTACTCTTG ATGAAGGTCA GCCAGCCTAT GCGCCTGGTC 1020

TGTACACCGT TCATCTGTCC TCTTTCAAAG TTGGTCAGTT CGGTTCCCTT ATGATTGACC 1080

GTCTGCGCCT CGTTCCGGCT AAGTAACATG GAGCAGGTCG CGGATTTCGA CACAATTTAT 1140

CAGGCGATGA TACAAATCTC CGTTGTACTT TGTTTCGCGC TTGGTATAAT CGCTGGGGGT 1200

CAAAGATGAG TGTTTTAGTG TATTCTTTCG CCTCTTTCGT TTTAGGTTGG TGCCTTCGTA 1260 
GTGGCATTAC GTATTTTACC CGTTTAATGG AAACTTCCTC ATGAAAAAGT CTTTAGTCCT 1320

CAAAGCCTCT GTAGCCGTTG CTACCCTCGT TCCGATGCTG TCTTTCGCTG CTGAGGGTGA 1380

CGATCCCGCA AAAGCGGCCT TTAACTCCCT GCAAGCCTCA GCGACCGAAT ATATCGGTTA 1440

TGCGTGGGCG ATGGTTGTTG TCATTGTCGG CGCAACTATC GGTATCAAGC TGTTTAAGAA 1500

ATTCACCTCG AAAGCAAGCT GATAAACCGA TACAATTAAA GGCTCCTTTT GGAGCCTTTT 1560

TTTTTGGAGA TTTTCAACGT GAAAAAATTA TTATTCGCAA TTCCTTTAGT TGTTCCTTTC 1620

TATTCTCACT CCGCTGAAAC TGTTGAAAGT TGTTTAGCAA AACCCCATAC AGAAAATTCA 1680

TTTACTAACG TCTGGAAAGA CGACAAAACT TTAGATCGTT ACGCTAACTA TGAGGGTTGT 1740

CTGTGGAATG CTACAGGCGT TGTAGTTTGT ACTGGTGACG AAACTCAGTG TTACGGTACA 1800

TGGGTTCCTA TTGGGCTTGC TATCCCTGAA AATGAGGGTG GTGGCTCTGA GGGTGGCGGT 1860 

TCTGAGGGTG GCGGTTCTGA GGGTGGCGGT ACTAAACCTC CTGAGTACGG TGATACACCT 1920

ATTCCGGGCT ATACTTATAT CAACCCTCTC GACGGCACTT ATCCGCCTGG TACTGAGCAA 1980

AACCCCGCTA ATCCTAATCC TTCTCTTGAG GAGTCTCAGC CTCTTAATAC TTTCATGTTT 2040

CAGAATAATA GGTTCCGAAA TAGGCAGGGG GCATTAACTG TTTATACGGG CACTGTTACT 2100

CAAGGCACTG ACCCCGTTAA AACTTATTAC CAGTACACTC CTGTATCATC AAAAGCCATG 2160

TATGACGCTT ACTGGAACGG TAAATTCAGA GACTGCGCTT TCCATTCTGG CTTTAATGAA 2220

GATCCATTCG TTTGTGAATA TCAAGGCCAA TCGTCTGACC TGCCTCAACC TCCTGTCAAT 2280

GCTGGCGGCG GCTCTGGTGG TGGTTCTGGT GGCGGCTCTG AGGGTGGTGG CTCTGAGGGT 2340

GGCGGTTCTG AGGGTGGCGG CTCTGAGGGA GGCGGTTCCG GTGGTGGCTC TGGTTCCGGT 2400

GATTTTGATT ATGAAAAGAT GGCAAACGCT AATAAGGGGG CTATGACCGA AAATGCCGAT 2460 

GAAAACGCGC TACAGTCTGA CGCTAAAGGC AAACTTGATT CTGTCGCTAC TGATTACGGT 2520 

GCTGCTATCG ATGGTTTCAT TGGTGACGTT TCCGGCCTTG CTAATGGTAA TGGTGCTACT 2580 

GGTGATTTTG CTGGCTCTAA TTCCCAAATG GCTCAAGTCG GTGACGGTGA TAATTCACCT 2640

TTAATGAATA ATTTCCGTCA ATATTTACCT TCCCTCCCTC AATCGGTTGA ATGTCGCCCT 2700
TTTGTCTTTA GCGCTGGTAA ACCATATGAA TTTTCTATTG ATTGTGACAA AATAAACTTA 2760
TTCCGTGGTG TCTTTGCGTT TCTTTTATAT GTTGCCACCT TTATGTATGT ATTTTCTACG 2820
TTTGCTAACA TACTGCCTAA TAAGGAGTCT TAATCATGCC AGTTCTTTTG GGTATTCCGT 2880 
TATTATTGCG TTTCCTCGGT TTCCTTCTGG TAACTTTGTT CGGCTATCTG CTTACTTTTC 2940 
TTAAAAAGGG CTTCGGTAAG ATAGCTATTG CCTGTTTCTT GCTCTTATTA TTGGGCTTAA 3000
CTCAATTCTT GTGGGTTATC TCTCTGATAT TAGCGCTCAA TTACCCTtTG ACTTTGTTCA 3060 
GGGTGTTCAG TTAATTCTCC CGTCTAATGC GCTTCCCTGT TTTTATCTTA TTCTCTCTGT 3120 
AAAGGCTGCT ATTTTCATTT TTGACCTTAA ACAAAAAATC GTTTCTTATT TGGATTGGGA 3180
TAAATAATAT GGCTGTTTAT TTTGTAACTC GCAAATTAGG CTCTGGAAAG ACGCTCGTTA 3240
GCGTTGGTAA GATTCAGGAT AAAATTGTAG CTGGGTGCAA AATAGCAACT AATCTTGATT 3300
TAAGGCTTCA AAAGCTCCCG CAAGTCGGGA GGTTCGCTAA AACGCCTCGC GTTCTTAGAA 3360

TACCGGATAA GCCTTCTATA TCTGATTTGC TTGCTATTGG GCGCGGTAAT GATTCCTACG 3420 

ATGAAAATAA AAACGGCTTG CTTGTTCTCG ATGAGTGCGG TACTTGGTTT AATACCCGTT 3480

CTTGGAATGA TAAGCAAAGA CAGCCGATTA TTGATTGGTT TCTACATGCT CGTAAATTAG 3540

GATGGGATAT TATTTTTCTT GTTCAGGACT TATCTATTGT TGATAAACAG GCGCGTTCTG 3600

CATTAGCTCA ACATGTTGTT TATTGTCGTC GTCTGGACAG AATTACTTTA CCTTTTGTCG 3660

GTACTTTATA TTCTCTTATT ACTGGCTCGA AAATGCCTCT GCCTAAATTA CATGTTGGCG 3720

TTGTTAAATA TGGCGATTCT CAATTAAGCC CTACTGTTGA GCGTTGGCTT TATACTGGTA 3780 

AGAATTTGTA TAACGCATAT GATACTAAAC AGGCTTTTTC TAGTAATTAT GATTCCGGTG 3840 

TTTATTCTTA TTTAACGCCT TATTTATCAC ACGGTCGGTA TTTCAAACCA TTAAATTTAG 3900 

GTCAGAAGAT GAAGCTTACT AAAATATATT TGAAAAAGTT TTCACGCGTT CTTTGTCTTG 3960

CGATTGGATT TGCATCAGCA TTTACATATA GTTATATAAC CCAACCTAAG CCGGAGGTTA 4020

AAAAGGTAGT CTCTCAGACC TATGATTTTC ATAAATTCAC TATTGACTCT TCTCAGCGTC 4080

TTAATCTAAG CTATCGCTAT GTTTTCAAGG ATTCTAAGGG AAAATTAATT AATAGCGACG 4140

ATTTACAGAA GCAAGGTTAT TCACTCACAT ATATTGATTT ATGTACTGTT TCCATTAAAA 4200 

AAGGTAATTC AAATGAAATT GTTAAATGTA ATTAATTTTG TTTTCTTGAT GTTTGTTTCA 4260 

TCATCTTCTT TTGCTCAGCT AATTGAAATG AATAATTCGC CTCTGCGCGA TTTTCTAACT 4320

TGGTATTCAA AGCAATCAGG CGAATCCGTT ATTGTTTCTC CCGATGTAAA AGGTACTGTT 4380 

ACTGTATATT CATCTGACCT TAAACCTGAA AATCTACGCA ATTTCTTTAT TTCTGTTTTA 4440 

CGTGCTAATA ATTTTGATAT GGTTGGTTCA ATTCCTTCCA TAATTCAGAA GTATAATCCA 4500 

AACAATCAGG ATTATATTGA TGAATTGCCA TCATCTGATA ATCAGGAATA TGATGATAAT 4560

TCCGCTCCTT CTGGTGGTTT CTTTGTTCCG CAAAATGATA ATGTTACTCA AACTTTTAAA 4620

ATTAATAACG TTCGGGCAAA GGATTTAATA CGAGTTGTCG AATTCTTTGT AAAGTCTAAT 4680

ACTTCTAAAT CCTCAAATGT ATTATCTATT GACGGCTCTA ATCTATTAGT TGTTAGTGCA 4740

CCTAAAGATA TTTTAGATAA CCTTCCTCAA TTCCTTTCTA CTGTTGATTT GCCAACTGAC 4800

CAGATATTGA TTGAGGGTTT GATATTTGAG GTTCACCAAG GTGATGCTTT AGATTTTTCA 4860

TTTGCTGCTG GCTCTCAGCG TGGCACTGTT GCAGGCGGTG TTAATACTGA CCGCCTCACC 4920

TCTGTTTTAT CTTCTGCTGG TGCTTCCTTC GGTATTTTTA ATGCCGATGT TTTAGGGCTA 4980

TCAGTTCGCG CATTAAAGAC TAATAGCCAT TCAAAAATAT TGTCTGTGCC ACGTATTCTT 5040 

ACGCTTTCAG GTCAGAAGGC TTCTATCTCT GTTGGCCAGA ATGTCCCTTT TATTACTGGT 5100 

CGTGTGACTG GTGAATCTGC CAATGTAAAT AATCCATTTC AGACGATTGA GCGTCAAAAT 5160

GTAGGTATTT CCATGAGCGT TTTTCCTGTT GCAATCGCTG GCGGTAATAT TGTTCTGGAT 5220

ATTACCAGCA AGGCCGATAG TTTGAGTTCT TCTACTCAGG CAAGTGATGT TATTACTAAT 5280

CAAAGAAGTA TTGCTACAAC GGTTAATTTG CGTGATGGAC AGACTCTTTT ACTCGGTGGC 5340
CTCACTGATT ATAAAAACAC TTCTCAAGAT TCTGGCGTAC CGTTCCTGTC TAAAATCCCT 5400

TTAATCGGCC TCCTGTTTAG CTCCCGCTCT GATTCCAACG AGGAAAGCAC GTTATACGTG 5460

CTCGTCAAAG CAACCATAGT ACGCGCCCTG TAGCGGCGCA TTAAGCGCGG CGGGTGTGGT 5520 

GGTTACGCGC AGCGTGACCC CTACACTTGC CAGCGCCCTA GCGCCCGCTC CTTTCGCTTT 5580 

CTTCCCTTCC TTTCTCGCCA CGTTCGCCGG CTTTCCCCGT CAAGCTCTAA ATCGGGGGCT 5640

CCCTTTAGGG TTCCGATTTA GTGCTTTACG GCACCTCGAC CCCAAAAAAC TTGATTTGGG 5700

TGATGGTTCA CGTAGTGGGC CATCGCCCTG ATAGACGGTT TTTCGCCCTT TGACGTTGGA 5760

GTCCACGTTC TTTAATAGTG GACTCTTGTT CCAAACTGGA ACAACACTCA ACCCTATCTC 5820

GGGCTATTCT TTTGATTTAT AAGGGATTTT GCCGATTTCG GAACCACCAT CAAACAGGAT 5880

TTTCGCCTGC TGGGGCAAAC CAGCGTCGAC CGCTTGCTGC AACTCTCTCA GGGCCAGGCG 5940 

GTGAAGGGCA ATCAGCTGTT GCCCGTCTCG CTGGTGAAAA GAAAAACCAC CCTGGCGCCC 6000 

AATACGCAAA CCGCCTCTCC CCGCGCGTTC GCCGATTCAT TAATGCAGCT GGCACGACAG 6060 

GTTTCCCGAC TGGAAAGCGG GCAGTGAGCG CAACGCAATT AATGTGAGTT AGCTCACTCA 6120

TTAGGCACCC CAGGCTTTAC ACTTTATGCT TCCGGCTCGT ATGTTCTCTG GAATTGTGAG 6180

CGGATAACAA TTTCACACGC CAAGGAGACA GTCATAATGA AATACCTATT GCCTACGGCA 6240

GCCGCTGGAT TGTTATTACT CGCTGCCCAA CCAGCCATGG CCGAGCTCTT CCCGCCATCT 6300

GATGAGCAGT TGAAATCTGG AAGTGCCTCT GTTGTGTGCC TGCTGAATAA CTTCTATCCC 6360

AGAGAGGCCA AAGTACAGTG CAAGCTGGAT AACGCCCTCC AATCGGGTAA CTCCCAGGAG 6420 

AGTGTCACAG AGCAGGACAG CAAGGACAGC ACCTACAGCC TCAGCAGCAC CCTGACGCTG 6480 

AGCAAAGCAG ACTACGAGAA ACACAAAGTC TACGCCTGCG AAGTCACCCA TCAGGGCCTG 6540

AGCTCGCCCG TCACAAAGAG CTTCAACAGG GGAGAGTGTT CTAGAACGCG TCACTTGGCA 6600

CTGGCCGTCG TTTTACAACG TCGTGACTGG GAAAACCCTG GCGTTACCCA AGCTTAATCG 6660

CCTTGCAGAA TTCCCTTTCG CCAGCTGGCG TAATAGCGAA GAGGCCCGCA CCGATCGCCC 6720

TTCCCAACAG TTGCGCAGCC TGAATGGCGA ATGGCGCTTT GCCTGGTTTC CGGCACCAGA 6780 

AGCGGTGCCG GAAAGCTGGC TGGAGTGCGA TCTTCCTGAG GCCGATACGG TCGTCGTCCC 6840 

CTCAAACTGG CAGATGCACG GTTACGATGC GCCCATCTAC ACCAACGTAA CCTATCCCAT 6900

TACGGTCAAT CCGCCGTTTG TTCCCACGGA GAATCCGACG GGTTGTTACT CGCTCACATT 6960 

TAATCTTGAT GAAAGCTGGC TACAGGAAGG CCAGACGCGA ATTATTTTTG ATGGCGTTCC 7020 

TATTGGTTAA AAAATGAGCT GATTTAACAA AAATTTAACG CGAATTTTAA CAAAATATTA 7080

ACGTTTACAA TTTAAATATT TGCTTATACA ATCTTCCTGT TTTTGGGGCT TTTCTGATTA 7140

TCAACCGGGG TACATATGAT TGACATGCTA GTTTTACGAT TACCGTTCAT CGATTCTCTT 7200

GTTTGCTCCA GACTCTCAGG CAATGACCTG ATAGCCTTTG TAGATCTCTC AAAAATAGCT 7260 

ACCCTCTCCG GCATTAATTT ATCAGCTAGA ACGGTTGAAT ATCATATTGA TGCTGATTTG 7320 

ACTGTCTCCG GCCTTTCTCA CCCTTTTGAA TCTTTACCTA CACATTACTC AGGCATTGCA 7380 
TTTAAAATAT ATGAGGGTTC TAAAAATTTT TATCCTTGCG TTCAAATAAA GGCTTCTCCC 7440
GCAAAAGTAT TACAGGGTCA TAATGTTTTT GGTACAACCG ATTTAGCTTT ATGCTCTGAG 7500
GCTTTATTGC TTAATTTTGC TAATTCTTTG CCTTGCCTGT ATGATTTATT GGATGTT 7557
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8118 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: circular



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

AATGCTACTA CTATTAGTAG AATTGATGCC ACCTTTTCAG CTCGCGCCCC AAATGAAAAT 60

ATAGCTAAAC AGGTTATTGA CCATTTGCGA AATGTATCTA ATGGTCAAAC TAAATCTACT 120

CGTTCGCAGA ATTGGGAATC AACTGTTACA TGGAATCAAA CTTCCAGACA CCGTACTTTA 180

GTTGCATATT TAAAACATGT TGAGCTACAG CACCAGATTC AGCAATTAAG CTCTAAGCCA 240

TCTGCAAAAA TGACCTCTTA TCAAAAGGAG CAATTAAAGG TACTCTCTAA TCCTGACCTG 300

TTGGAGTTTG CTTCCGGTCT GGTTCGCTTT GAAGCTCGAA TTAAAACGCG ATATTTGAAG 360

TCTTTCGGGC TTCCTCTTAA TCTTTTTGAT GCAATCCGCT TTGCTTCTGA CTATAATAGT 420

CAGGGTAAAG ACCTGATTTT TGATTTATGG TCATTCTCGT TTTCTGAACT GTTTAAAGCA 480

TTTGAGGGGG ATTCAATGAA TATTTATGAC GATTCCGCAG TATTGGACGC TATCCAGTCT 540

AAACATTTTA CTATTACCCC CTCTGGCAAA ACTTCTTTTG CAAAAGCCTC TCGCTATTTT 600

GGTTTTTATC GTCGTCTGGT AAACGAGGGT TATGATAGTG TTGCTCTTAC TATGCCTCGT 660

AATTCCTTTT GGCGTTATGT ATCTGCATTA GTTGAATGTG GTATTCCTAA ATCTCAACTG 720

ATGAATCTTT CTACCTGTAA TAATGTTGTT CCGTTAGTTC GTTTTATTAA CGTAGATTTT 780

TCTTCCCAAC GTCCTGACTG GTATAATGAG CCAGTTCTTA AAATCGCATA AGGTAATTCA 840

CAATGATTAA AGTTGAAATT AAACCATCTC AAGCCCAATT TACTAGTCGT TCTGGTGTTT 900

CTCGTCAGGG CAAGCCTTAT TCACTGAATG AGCAGCTTTG TTACGTTGAT TTGGGTAATG 960

AATATCCGGT TCTTGTCAAG ATTACTCTTG ATGAAGGTCA GCCAGCCTAT GCGCCTGGTC 1020

TGTACACCGT TCATCTGTCC TCTTTCAAAG TTGGTCAGTT CGGTTCCCTT ATGATTGACC 1080

GTCTGCGCCT CGTTCCGGCT AAGTAACATG GAGCAGGTCG CGGATTTCGA CACAATTTAT 1140

CAGGCGATGA TACAAATCTC CGTTGTACTT TGTTTCGCGC TTCGTATAAT CGCTGGGGGT 1200

CAAAGATGAG TGTTTTAGTG TATTCTTTCG CCTCTTTCGT TTTAGGTTGG TGCCTTCGTA 1260

GTGGCATTAC GTATTTTACC CGTTTAATGG AAACTTCCTC ATCAAAAAGT CTTTAGTCCT 1320

CAAAGCCTCT GTAGCCGTTG CTACCCTCGT TCCGATGGTG TCTTTCGCTG CTGAGGGTGA 1380

CGATCCCGCA AAAGCGGCCT TTAACTCCCT GCAAGCCTCA GCGACCGAAT ATATCGGTTA 1440

TGCGTGGGCG ATGGTTGTTG TCATTGTCGG CGCAACTATC GGTATCAAGC TGTTTAAGAA 1500
ATTCACCTCG AAAGCAAGCT GATAAACCGA TACAATTAAA GGCTCCTTTT GGAGCCTTTT 1560

TTTTTGGAGA TTTTCAACGT GAAAAAATTA TTATTCGCAA TTCCTTTAGT TGTTCCTTTC 1620

TATTCTCACT CCGCTGAAAC TGTTGAAAGT TGTTTAGCAA AACCCCATAC AGAAAATTCA 1680

TTTACTAACG TCTGGAAAGA CGACAAAACT TTAGATCGTT ACGCTAACTA TGAGGGTTGT 1740 

CTGTGGAATG CTACAGGCGT TGTAGTTTGT ACTGGTGACG AAACTCAGTG TTACGGTACA 1800 

TGGGTTCCTA TTGGGCTTGC TATCCCTGAA AATGAGGGTG GTGGCTCTGA GGGTGGCGGT 1860

TCTGAGGGTG GCGGTTCTGA GGGTGGCGGT ACTAAACCTC CTGAGTACGG TGATACACCT 1920

ATTCCGGGCT ATACTTATAT CAACCCTCTC GACGGCACTT ATCCGCCTGG TACTGAGCAA 1980

AACCCCGCTA ATCCTAATCC TTCTCTTGAC GAGTCTCAGC CTCTTAATAC TTTCATGTTT 2040

CAGAATAATA GGTTCCGAAA TAGGCAGCGG GCATTAACTG TTTATACGGG CACTGTTACT 2100

CAAGGCACTG ACCCCGTTAA AACTTATTAC CAGTACACTC CTGTATCATC AAAAGCCATG 2160

TATGACGCTT ACTGGAACGG TAAATTCAGA GACTGCGCTT TCCATTCTGG CTTTAATGAA 2220

GATCCATTCG TTTGTGAATA TCAAGGCCAA TCGTCTGACC TGCCTCAACC TCCTGTCAAT 2280

GCTGGCGGCG GCTCTGGTGG TGGTTCTGGT GGCGGCTCTG AGGGTGGTGG CTCTGAGGGT 2340

GGCGGTTCTG AGGGTGGCGG CTCTGAGCGA GGCGGTTCCG GTGGTGGCTC TGGTTCCGGT 2400

GATTTTGATT ATGAAAAGAT GGCAAACGCT AATAAGGGGG CTATGACCGA AAATGCCGAT 2460

GAAAACGCGC TACAGTCTGA CGCTAAAGGC AAACTTGATT CTGTCGCTAC TGATTACGGT 2520 

GCTGCTATCG ATGGTTTCAT TGGTGACGTT TCCGGCCTTG CTAATGGTAA TGGTGCTACT 2580 

GGTGATTTTG CTGGCTCTAA TTCCCAAATG GCTCAAGTCG GTGAtGGTGA TAATTCACCT 2640 

TTAATGAATA ATTTCCGTCA ATATTTACCT TCCCTCCCTC AATCGGTTGA ATGTCGCCCT 2700

TTTGTCTTTA GCGCTGGTAA ACCATATGAA TTTTCTATTC ATTGTGACAA AATAAACTTA 2760

TTCCGTGGTG TCTTTGCGTT TCTTTTATAT GTTGCCACCT TTATGTATGT ATTTTCTACG 2820

TTTGCTAACA TACTGCGTAA TAAGGAGTCT TAATCATGCC AGTTCTTTTG GGTATTCCGT 2880

TATTATTGCG TTTCCTCGGT TTCCTTCTGG TAACTTTGTT CGGCTATCTG CTTACTTTTC 2940

TTAAAAAGGG CTTCGGTAAG ATAGCTATTG CTATTTCATT GTTTCTTGCT CTTATTATTG 3000

GGCTTAACTC AATTCTTGTG GGTTATCTCT CTGATATTAG CGCTCAATTA CCCTCTGACT 3060

TTGTTCAGGG TGTTCAGTTA ATTCTCCCGT CTAATGCGCT TCCCTGTTTT TATGTTATTC 3120

TCTCTGTAAA GGCTGCTATT TTCATTTTTG ACGTTAAACA AAAAATCGTT TCTTATTTGG 3180

ATTGGGATAA ATAATATGGC TGTTTATTTT GTAACTGGCA AATTAGGCTC TGGAAAGACG 3240 

CTCGTTAGCG TTCGTAAGAT TCAGCATAAA ATTGTAGCTG GGTGCAAAAT AGCAACTAAT 3300 

CTTGATTTAA GGCTTCAAAA CCTCCCGCAA GTCGGGAGGT TCGCTAAAAC GCCTCGCGTT 3360
CTTAGAATAC CGGATAAGCC TTCTATATCT GATTTGCTTG CTATTGGGCG CGGTAATGAT 3420
TCCTACGATG AAAATAAAAA CGGCTTGCTT GTTCTCGATG AGTGCGGTAC TTGGTTTAAT 3480

ACCCGTTCTT GGAATGATAA GGAAAGACAG CCGATTATTG ATTGGTTTCT ACATGCTCGT 3540

AAATTAGGAT GGGATATTAT TTTTCTTGTT CAGGACTTAT CTATTGTTGA TAAACAGGCG 3600

CGTTCTGCAT TAGCTGAACA TGTTGTTTAT TGTCGTCGTC TGGACAGAAT TACTTTACCT 3660

TTTGTCGGTA CTTTATATTC TCTTATTACT GGCTCGAAAA TGCCTCTGCC TAAATTACAT 3720

GTTGGCGTTG TTAAATATGG CGATTCTCAA TTAAGCCCTA CTGTTGAGCG TTGGCTTTAT 3780

ACTGGTAAGA ATTTGTATAA CCCATATGAT ACTAAACAGG CTTTTTCTAG TAATTATGAT 3840

TCCGGTGTTT ATTCTTATTT AACGCCTTAT TTATCACACG GTCGGTATTT CAAACCATTA 3900 

AATTTAGGTC AGAAGATGAA GCTTACTAAA ATATATTTGA AAAAGTTTTC ACGCGTTCTT 3960 

TGTCTTGCGA TTGGATTTGC ATCAGCATTT ACATATAGTT ATATAACCCA ACCTAAGCCG 4020

GAGGTTAAAA AGGTAGTCTC TCAGACCTAT GATTTTGATA AATTCACTAT TGACTCTTCT 4080

CAGCGTCTTA ATCTAAGCTA TCGCTATGTT TTCAAGGATT CTAAGGGAAA ATTAATTAAT 4140

AGCGACGATT TACAGAAGCA AGGTTATTCA CTCACATATA TTGATTTATG TACTGTTTCC 4200

ATTAAAAAAG GTAATTCAAA TGAAATTGTT AAATGTAATT AATTTTGTTT TCTTGATGTT 4260

TGTTTCATCA TCTTCTTTTG CTCAGGTAAT TGAAATGAAT AATTCGCCTC TGCGCGATTT 4320

TGTAACTTGG TATTCAAAGC AATCAGGCGA ATCCGTTATT GTTTCTCCCG ATGTAAAAGG 4380

TACTGTTACT GTATATTCAT CTGACGTTAA ACCTGAAAAT CTACGCAATT TCTTTATTTC 4440

TGTTTTACGT GCTAATAATT TTGATATGGT TGGTTCAATT CCTTCCATAA TTCAGAAGTA 4500

TAATCCAAAC AATCAGGATT ATATTGATGA ATTGCCATCA TCTGATAATC AGGAATATCA 4560 

TGATAATTCC GCTCCTTCTG GTGGTTTCTT TGTTCCGCAA AATGATAATG TTACTCAAAC 4620 

TTTTAAAATT AATAACGTTC GGGCAAAGGA TTTAATACGA CTTCTCCAAT TGTTTGTAAA 4680 

GTCTAATACT TCTAAATCCT CAAATGTATT ATCTATTGAC GGCTCTAATC TATTAGTTGT 4740 

TAGTGCACCT AAAGATATTT TAGATAACCT TCCTCAATTC CTTTCTACTG TTGATTTCCC 4800 

AACTGACCAG ATATTGATTG AGGGTTTGAT ATTTGAGGTT CAGCAAGCTG ATGCTTTAGA 4860

TTTTTCATTT GCTGCTGGCT CTCAGCGTGG CACTGTTGCA GGCGGTGTTA ATACTGACCG 4920

CCTCACCTCT GTTTTATCTT CTGCTGGTGG TTCGTTCGGT ATTTTTAATG GCCATGTTTT 4980

ACGGCTATCA GTTCGCGCAT TAAAGACTAA TAGCCATTCA AAAATATTGT CTCTGCCACG 5040

TATTCTTACG CTTTCAGCTC AGAACGGTTC TATCTCTGTT GGtCAGAATG TCCCTTTTAT 5100 

TACTGGTCGT GTGACTGGTG AATCTCCCAA TGTAAATAAT CCATTTCAGA CGATTGAGCG 5160 

TCAAAATGTA GGTATTTCCA TGAGCGTTTT TCCTGTTGCA ATGGCTGGCG GTAATATTGT 5220 

TCTGGATATT ACCAGCAAGC CCGATAGTTT GAGTTCTTCT ACTCAGGCAA GTGATCTTAT 5280 

TACTAATCAA AGAAGTATTG CTACAACGGT TAATTTGCGT GATGGACAGA CTCTTTTACT 5340

CGGTGGCCTC ACTGATTATA AAAACACTTC TCAAGATTCT GGCCTACCGT TCCTGTCTAA 5400

AATCCCTTTA ATCGGCCTCC TGTTTAGCTC CCGCTCTGAT TCCAACGAGG AAAGCACGTT 5460

ATACGTGCTC GTCAAAGCAA CCATAGTACG CGCCCTGTAG CGGCGCATTA AGCGCGGCGG 5520

GTGTGGTGGT TACGCGCAGC GTGACCGCTA CACTTGCCAG CGCCCTAGCG CCCGCTCCTT 5580
TCGCTTTCTT CCCTTCCTTT CTCGCCACGT TCGCGGGCTT TCCCCGTCAA GCTCTAAATC 5640

GGGGGCTCCC TTTAGGGTTC CGATTTAGTG CTTTACGGCA CCTCGACCCC AAAAAACTTG 5700

ATTTGGGTGA TGGTTCACGT AGTGGGCCAT CGCCCTGATA GACGGTTTTT CGCCCTTTGA 5760

CGTTGGAGTC CACGTTCTTT AATAGTGGAC TCTTGTTCCA AACTGGAACA ACACTCAACC 5820

CTATCTCGGG CTATTCTTTT GATTTATAAG GGATTTTGCC GATTTCGGAA CCACCATCAA 5880 

ACAGGATTTT CGCCTGCTGG GGCAAACCAG CGTGGACCGC TTGCTGCAAC TCTCTCAGGG 5940

CCAGGCGGTG AAGGGCAATC AGCTGTTGCC CGTCTCGCTG GTGAAAAGAA AAACCACCCT 6000 

GGCGCCCAAT ACGCAAACCG CCTCTCCCCG CGCGTTGGCC GATTCATTAA TGCAGCTGGC 6060 

ACGACAGGTT TCCCGACTGG AAAGCGGGCA GTGAGCGCAA CGCAATTAAT GTGAGTTAGC 6120

TCACTCATTA GGCACCCCAG GCTTTACACT TTATGCTTCC GGCTCGTATG TTGTGTGGAA 6180

TTGTGAGCGG ATAACAATTT CACACGCCAA GGAGACAGTC ATAATGAAAT ACCTATTGCC 6240

TACGGCAGCC GCTGGATTCT TATTACTCGC TGCCCAACCA GCCATGGCCG AGCTCTTCCC 6300

GCCATCTGAT GAGCAGTTGA AATCTGGAAC TGCCTCTGTT GTGTGCCTGC TGAATAACTT 6360 

CTATCCCAGA GAGGCCAAAG TACAGTGGAA GGTGGATAAC GCCCTCCAAT CGGGTAACTC 6420 

CCAGGAGAGT GTCACAGAGC AGGACAGCAA GGACAGCACC TACAGCCTCA GCAGCACCCT 6480

GACGCTGAGC AAAGCAGACT ACGAGAAACA CAAAGTCTAC GCCTGCGAAG TCACCCATCA 6540

GGGCCTGAGC TCGCCCGTCA CAAAGAGCTT CAACAGGGGA GAGTGTTCTA GAACGCGTCA 6600

CTTGGCACTG GCCGTCGTTT TACAACGTCG TGACTGGGAA AACCCTGGCG TTACCCAAGC 6660

TTTGTACATG GAGAAAATAA AGTGAAACAA AGCACTATTG CACTGGCACT CTTACCCTTA 6720

CTGTTTACCC CTGTGGCAAA AGCCGCCTCC ACCAAGGGCC CATCGGTCTT CCCCCTGGCA 6780

CCCTCCTCCA AGAGCACCTC TGGGGGCACA GCGGCCCTGG GCTGCCTGGT CAAGACTAAT 6840

TCCCCGAACC GGTGACGGTG TCGTGGAACT CAGGCGCCCT GACCAGCGGC GTGCACACCT 6900 

TCCCGGCTGT CCTACAGTCC TCAGGACTCT ACTCCCTCAG CAGCGTGGTG ACCGTGCCCT 6960 

CCAGCAGCTT GGGCACCCAG ACCTACATCT GCAACGTGAA TCACAAGCCC AGCAACACCA 7020 

AGGTGGACAA GAAAGCAGAG CCCAAATCTT GTACTAGTGG ATCGTACCCG TACGACGTTC 7080 

CGGACTACGC TTCTTAGGCT GAAGGCGATG ACCCTGCTAA GGCTGCATTC AATAGTTTAC 7140 

AGGCAAGTGC TACTGAGTAC ATTGGCTACG CTTGGGCTAT GGTAGTAGTT ATAGTTGGTG 7200 

CTACCATAGG GATTAAATTA TTCAAAAAGT TTACGAGCAA GGCTTCTTAA GCAATAGCGA 7260

AGAGGCCCGC ACCGATCGCC CTTCCCAACA GTTCCGCAGC CTGAATGGCG AATGGCGCTT 7320

TGCCTGCTTT CCGGCACCAG AAGCGGTGCC GGAAAGCTGG CTCGAGTGCG ATCTTCCTGA 7380

GGCCGATACG GTCGTCCTCC CCTCAAACTG GCAGATGCAC GGTTACGATG CGCCCATCTA 7440

CACCAACGTA ACCTATCCCA TTACGGTCAA TCCGCCGTTT GTTCCCACGG AGAATCCGAC 7500

GGGTTGTTAC TCGCTCACAT TTAATGTTGA TGAAAGCTGG CTACAGGAAG GCCAGACGCG 7560

AATTATTTTT GATGGCGTTC CTATTGGTTA AAAAATGAGC TGATTTAACA AAAATTTAAC 7620 
GCCAATTTTA ACAAAATATT AACGTTTACA ATTTAAATAT TTGCTTATAC AATCTTCCTG 7680

TTTTTGGGGC TTTTCTGATT ATCAACCGGG GTACATATGA TTGACATGCT AGTTTTACGA 7740

TTACCGTTCA TCGATTCTCT TGTTTGCTCC AGACTCTCAG GCAATGACCT GATAGCCTTT 7800

GTAGATCTCT CAAAAATAGC TACCCTCTCC GGCATTAATT TATCAGCTAG AACGGTTGAA 7860

TATCATATTG ATGGTGATTT GACTGTCTCC GGCCTTTCTC ACCCTTTTGA ATCTTTACCT 7920

ACACATTACT CAGCCATTGC ATTTAAAATA TATGAGGGTT CTAAAAATTT TTATCCTTCC 7980

GTTGAAATAA AGGCTTCTCC CGCAAAAGTA TTACAGGGTC ATAATGTTTT TGGTACAACC 8040

GATTTAGCTT TATGCTCTGA GGCTTTATTG CTTAATTTTG CTAATTCTTT GCCTTGCCTG 8100

TATGATTTAT TGGACGTT 8118

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(5, "")
(D) OTHER INFORMATION: /note= "S REPRESENTS EQUAL MIXTURE OF G AND C"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(6, "")
(D) OTHER INFORMATION: /note= "M REPRESENTS EQUAL MIXTURE OF A AND C"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(8, "")
(D) OTHER INFORMATION: /note= "R REPRESENTS EQUAL MIXTURE OF A AND G"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(11, "")
(D) OTHER INFORMATION: /note= "K REPRESENTS EQUAL MIXTURE OF G AND T"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(20, "")
(D) OTHER INFORMATION: /note= "W REPRESENTS EQUAL MIXTURE OF A AND T"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
AGGTSMARCT KCTCGAGTCW GG 
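
The degenerate positions noted for SEQ ID NO: 6 can be expanded mechanically. The following is a minimal illustrative sketch, not part of the original listing: the function name `expand_degenerate` and the lookup table are assumptions introduced here, and the table covers only the ambiguity codes defined in the feature notes above (S, M, R, K, W).

```python
from itertools import product

# Ambiguity codes as defined in the feature notes for SEQ ID NO: 6.
# Plain bases map to themselves; each code maps to its two alternatives.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "GC",  # G or C
    "M": "AC",  # A or C
    "R": "AG",  # A or G
    "K": "GT",  # G or T
    "W": "AT",  # A or T
}

def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer.

    Spaces (the 10-base grouping used in the listing) are ignored.
    """
    bases = primer.replace(" ", "").upper()
    choices = [CODES[b] for b in bases]
    return ["".join(combo) for combo in product(*choices)]

variants = expand_degenerate("AGGTSMARCT KCTCGAGTCW GG")
print(len(variants))  # five two-fold positions give 2**5 combinations
```

With five two-fold degenerate positions the primer encodes 32 concrete 22-mers; a fully specified primer such as the one given for SEQ ID NO: 7 appears among them.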
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
AGGTCCAGCT GCTCGAGTCT GG 
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
AGGTCCAGCT GCTCGAGTCA GG
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

AGGTCCAGCT TCTCGAGTCT GG 

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
AGGTCCAGCT TCTCGAGTCA GG 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AGGTCCAACT GCTCGAGTCT GG
(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
AGGTCCAACT GCTCGAGTCA GG 
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
AGGTCCAACT TCTCGAGTCT GG 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

AGGTCCAACT TCTCGAGTCA GG 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS : 
(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(5..6, "")
(D) OTHER INFORMATION: /note= "N=INOSINE"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(8, "")
(D) OTHER INFORMATION: /note= "N=INOSINE"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(11, "")
(D) OTHER INFORMATION: /note= "N=INOSINE"

(ix) FEATURE:
(A) NAME/KEY: misc_difference
(B) LOCATION: replace(20, "")
(D) OTHER INFORMATION: /note= "W REPRESENTS EQUAL MIXTURE OF A AND T"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
AGGTNNANCT NCTCGAGTCW GG
(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CTATTAACTA GTAACGGTAA CAGTGGTGCC TTGCCCCA
(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

AGGCTTACTA GTACAATCCC TGGGCACAAT 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS : 
(A) LENGTH: 32 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
CCAGTTCCGA GCTCCTTGTG ACTCAGGAAT CT 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
CCAGTTCCGA CCTCCTGTTG ACGCAGCCGC CC 
(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
CCAGTTCCGA GCTCGTGCTC ACCCAGTCTC CA
(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
CCAGTTCCGA GCTCCAGATG ACCCAGTCTC
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

CCAGATGTGA GCTCGTGATG ACCCAGACTC CA 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
CCAGATGTGA GCTCGTGATG ACCCAGTCTC CA 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CCAGTTCCGA GCTCGTGATG ACACAGTCTC CA
(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
GCAGCATTCT AGAGTTTCAG CTCCAGCTTG CC 
(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
GCGCCGTCTA GAATTAACAC TCATTCCTGT TGAA 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
GATCCTAGCC TGAAGGCGAT GACCCTGCTA AGGCTGC 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
ATTCAATAGT TTACAGGCAA GTGCTACTGA GTACA 35
(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
TTGGCTACGC TTGGGCTATG GTAGTAGTTA TAGTT
(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
GGTGCTACCA TAGGGATTAA ATTATTCAAA AAGTT
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
TACGAGCAAG GCTTCTTA 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
AGCTTAAGAA GCCTTGCTCG TAAACTTTTT GAATAATTT 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
AATCCCTATG GTAGCACCAA CTATAACTAC TACCAT
(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
AGCCCAAGCG TAGCCAATGT ACTCAGTAGC ACTTG
(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
CCTGTAAACT ATTGAATGCA GCCTTAGCAG GG 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

ATCGCCTTCA GCCTAG 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
CATTTTTGCA GATGGCTTAG A 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
TAGCATTAAC GTCCAATA 
(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
ATATATTTTA GTAAGCTTCA TCTTCT 
(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
GACAAAGAAC GCGTGAAAAC TTT 
(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
GCGGGCCTCT TCGCTATTGC TTAAGAAGCC TTGCT 
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 
AAACGACGGC CAGTGCCAAG TGACGCGTGT GAAATTGTTA TCC 



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 
ggcgaaaggg aattctgcaa GGCGATTAAG CTTGGGTAAC GCC 
(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
GGCGTTACCC AAGCTTTGTA CATGGAGAAA ATAAAG 
(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
TGAAACAAAG CACTATTGCA CTGGCACTCT TACCG 
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

TACTCTTTAC CCCTGTGACA AAAGCCCCCC AGGTCCAGCT GC 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 44 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
TCCAGTCAGG CCTATTGTGC CCACGGATTG TACTAGTGGA TCCG 
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
TGGCGAAAGG GAATTCGGAT CCACTftGTAC AATCCCTG 
(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
GGCACAATAG GCCTGACTCG AGCAGCTCGA CCAG£GCGGC TT 
(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

TTGTCACAGG GGTAAACAGT AACGGTAACG GTAAGTGTGC^ 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
GTGCAATAGT GCTTTGTTTC ACTTTATTTT CTCCATGTAC AA 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
TAACGGTAAG AGTGCCAGTG C 
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
CACCTTCATG AATTCGGCAA GGAGaVaGTC AT 
(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
AATTCGCCAA GGAGACAGTC AT 
(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

AATGAAATAC CTATTGCCTA CGGCAGCCGC TGGATTGTT 

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
ATTACTCGCT GCCCAACCAG CCATCCCCGA GCTCGTGAT 
(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
GACCCAGACT CCAGATATCC AA&AGGAATC AGTGTTAAT 
(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
TCTAGAACGC GTC 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
TTCAGGTTGA AGCTTACGCG TTCTAGAATT AACACTCAtT CCTGT 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

TGGATATCTG GAGTCTGGGT CATCACGAGC TCGGCCATG 

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 39 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
GCTGGTTGGC CAGCGAGTAA TAACAATCCA GCGGCTGCC 
(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 
GTAGGCAATA GGTATTTCAT TATGACTGTC CTTGGCG
(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
TGACTGTCTC CTTGGCGTGT GAAATTGTTA 
(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

TAACACTCAT TCCGGATGGA ATTCTGGAGT CTGGGT 

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 
GCCAGTGCCA AGTGACGCCT TCTA 
(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
ATATATTTTA GTAAGCTTCA TCTTCT 
(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 
GACAAAGAAC GCGTGAAAAC TTT
(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 76 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
CTGAACCTGT CTGGGACCAC AGTTGATGCT AilAGGATCAG ATCTAGAATT CATTTAGAGA 
CTGGCCTGGC TTCTGC 
(2) INFORMATION FOR SEQ ID NO: 69: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 80 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
TCGACCGTTG GTAGGAATAA TGCAATTAAT GGAGTAGCTC TAAATTCAGA ATTCATCTAC
ACCCAGTCCA TCCAGTAGCT 
(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
GGTAAACAGT AACGGTAAGA GTGCCAG 
(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

CGCCTTCAGC CTAAGAAGCG TAGTCCGGAA CGTCGTACGG GTAGGATCCA CTAG 

(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
CACCGGTTCG GGGAATTAGT CTTGACCAGG yAGCCCAGGG C 
(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
ATTCCACACA TTATACGAGC CGGAAGCATA AAGTCTCA^G CCTGGGGTGC C 
(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 
CTGCTCATCA GATGGCGGGA AGAGCTCGGC CATGGCTGGT TG 
(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

GAACACACTG ACGGAGGGGG CGAgAL- 

b CCAGQteGC CATGGCTGGT TG 



